Abstract

The study of complex networks has pursued an understanding of macroscopic behavior by focusing
on power laws in microscopic observables. Here, we uncover two universal fundamental physical
principles that underlie the generation of complex networks. Together, these principles predict
the generic emergence of deviations from ideal power laws, which were previously explained away
by reference to the thermodynamic limit. Our approach proposes a paradigm shift in the physics
of complex networks, toward the use of power-law deviations to infer meso-scale structure from
macroscopic observations.
Introduction


A recent seminal discovery showed that in nature a simple physical principle often governs
the growth of ‘random networks’. The so-called preferential attachment (‘the rich
get richer’) rule leads to complex networks whose properties contrast with those predicted
by classical random network theory. A fundamental universality principle of physics
must be held responsible for this change of paradigm. In our interpretation, the preferential
attachment principle expresses that the formation of ensembles requires attractive forces that
are valid over decades of spatial extension (in physics these may involve, e.g., mass or
charge). It is this principle that generates the celebrated power laws observed
in the distribution of mesoscopic network indicators, such as network degree, connectivity
weight, or neuronal avalanche size. A second fundamental universality principle of
physics is, however, active at the same time and has passed unnoticed so far: real-world
connectivity requires space, and this space is limited. The question that
we address in our work is what traces this principle leaves, during network formation
and in the final network. This question has not been answered so far.

Generic network building algorithm 

To study this question, we consider a novel generic network building algorithm (our
‘primary model’) that implements both principles at the most basic level as follows. We
start from a connected network of $N_0$ nodes. With probability p, an ‘outside’ node, from a
finite set of available nodes, is added; alternatively, with probability 1 - p, an attempt is made
to construct an ‘inside’ edge (see below). If an outside node is added, the new node joins the
network by m edges, where the target nodes are sampled according to their degree k (i.e.
∝ k), following preferential attachment. For an inside edge, two nodes are independently
chosen by preferential attachment (i.e., with probability proportional to their degree). If the
two chosen nodes are not identical and not already connected, an edge is established. In this way,
the algorithm’s second alternative expresses the second fundamental principle in terms of
an ‘edge saturation’ (at a level defined by p and m, implemented right from the start of the
network’s growth). The process stops when the set of available nodes is depleted. The algorithm
generates undirected topological networks of arbitrary size, void of self-loops and multiple edges;
examples will be discussed later. Fig. 1 shows the stereotypical degree distribution obtained
in this way, exhibiting an extended power-law part of the distribution terminated by a hump
(that, upon the network’s growth, moves towards larger degrees, until the process is stopped
by node depletion, cf. Fig. 1b).
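The growth rule described above can be sketched in a few lines of Python. This is a minimal
illustration, not the authors' implementation: the function name grow_network, the seed network
(a clique on m + 1 nodes), and the requirement that the m targets of a new node be distinct are
assumptions made here so that the sketch is runnable.

```python
import random


def grow_network(n_total, p, m=2, seed=None):
    """Grow an undirected simple graph with outside nodes and inside edges.

    Assumptions (not fixed by the text): the seed is a clique on m + 1 nodes,
    and the m targets of a new node are distinct. Returns {node: set(neighbours)}.
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]; for p = 0 no node is ever added")
    if n_total < m + 1:
        raise ValueError("n_total must be at least m + 1")
    rng = random.Random(seed)
    adj = {i: set() for i in range(m + 1)}
    stubs = []  # each node appears once per incident edge, so choice(stubs) samples with P ~ k

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)
        stubs.extend((u, v))

    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            add_edge(u, v)

    while len(adj) < n_total:  # stop when the available nodes are depleted
        if rng.random() < p:
            # Outside node: attach to m distinct targets chosen proportionally to degree.
            targets = set()
            while len(targets) < m:
                targets.add(rng.choice(stubs))
            new = len(adj)
            adj[new] = set()
            for v in sorted(targets):
                add_edge(new, v)
        else:
            # Inside edge: two independent preferential draws; the attempt fails
            # (edge saturation) if the nodes coincide or are already connected.
            u, v = rng.choice(stubs), rng.choice(stubs)
            if u != v and v not in adj[u]:
                add_edge(u, v)
    return adj
```

For p = 1 the sketch reduces to plain preferential attachment, so a network of n nodes ends with
exactly m(m + 1)/2 + m(n - m - 1) edges; smaller p adds inside edges on top of that.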



FIG. 1: Characteristic degree distributions from the two key principles (for different values of 
parameter p and fixed parameter m = 2; the effect of m is exhibited in Fig. 3 and Fig. 4).
Network size t = 10^ nodes, mean of 10^ realizations. Dashed lines: power-law visual guides. The 
effect is most saliently expressed for exponents < 2, occurring often in gene or protein networks. 


Network properties 

While we observe widespread activity to find power-law distributions in all areas of
physics, we emphasize that, given the fundamental ingredients necessary in the network
building process, neat power laws will be found only in rare cases. Examples of experimental
data with the deviations that our key principles predict are shown in Fig. 2. While our
real-world examples are often related to biology (mostly because of the great availability
of the underlying data, and because of the greater simplicity of the examples), all of our
arguments are immediately transferable to physical situations where previous analysis has
generally stopped at the preferential attachment level. Our analysis now provides guidelines
for inferring from macroscopic measurements the microscopic properties that dominate network
growth (cf. Fig. 3, where the ‘bumpiness’ of the distribution P(k) was evaluated as
the deviation from the power law p(k) excluding the hump, as (P(k) - p(k))/p(k)). This
provides an important input for the modeling of real-world systems (see, e.g., the Drosophila
network example discussed below). By superposition of prototypes with different p and m
parameters, more general hump structures can be generated (Fig. 2). This mechanism
provides an as yet unexplored link between the macro- and meso-scales that can be invaluable
for both the modeling and the further analysis of real-world systems.
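The relative deviation (P(k) - p(k))/p(k) can be evaluated along the following lines. The
sketch is hedged: the text does not say how the reference power law p(k) is obtained, so a
least-squares fit in log-log coordinates restricted to k <= k_max_fit is assumed here, and the
names fit_power_law, bumpiness and k_max_fit are hypothetical.

```python
import math


def fit_power_law(ks, Ps, k_max_fit):
    """Fit P(k) ~ c * k**(-alpha) by least squares on log-log data with k <= k_max_fit.

    Restricting the fit to k <= k_max_fit is an assumed way of 'excluding the hump'.
    """
    pts = [(math.log(k), math.log(P)) for k, P in zip(ks, Ps) if k <= k_max_fit and P > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    c = math.exp(mean_y - slope * mean_x)
    return c, -slope


def bumpiness(ks, Ps, k_max_fit):
    """Return the pointwise relative deviation (P(k) - p(k)) / p(k) for every k."""
    c, alpha = fit_power_law(ks, Ps, k_max_fit)
    return [(P - c * k ** (-alpha)) / (c * k ** (-alpha)) for k, P in zip(ks, Ps)]
```

A single number per distribution, as needed for a phase diagram, could then be taken as the
maximum of the returned list; that choice of summary is likewise an assumption.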



FIG. 2: Typical weight and degree distributions, respectively, from experiments, and their
qualitative modeling (black: experimental, red: simulation data). a) Network of synchronizing
linear phase oscillators (network weight distribution during synchronization). b) Gene family
for S. cerevisiae (family size distribution). For the modeling, different (p,m)-models were
superimposed for a).


FIG. 3 (panels: exponent, hump): Modeling guidelines: Phase diagram of the humped power law’s
exponent and ‘humpiness’ as a function of the local parameters (p,m) (see text). Domains of
humpiness: I) not resolvable, II) minor, III) significant, IV) salient. Guided by the power-law
paradigm, investigations have mostly focused on examples from domains I and II. Network sizes:
t = 10^.
In contrast to preferential attachment networks, a network generated by the
two fundamental physical principles embodied in our primary model will not necessarily be
sparse (sparseness would imply a power-law exponent > 2, cf. Fig. 1). Moreover, Dorogovtsev
and Mendes’ modified preferential attachment algorithm, with its double regimes of power-law
behavior, also deviates from the fundamental principles that we have worked out. That
model uses a second internal linking process that is always successful in making new
connections. In our case it is exactly the edge connection failures (by edge saturation) that
define the network structure. Whereas the rate of internal linking in their algorithm accelerates
with the network size, our approach does not share this property. Moreover, the network
structures that we obtain depend primarily on parameter p, and the obtained distributions
are generally unaffected by the network’s initial condition (in contrast to earlier models).

The modeling of biological networks containing only a small number of nodes is a particular
challenge. The example of the Drosophila courtship network, a network that is built
on observable irreducible acts of body language (cf. Figs. 4 and 5), illustrates that our
approach also successfully masters this challenge (a further discussion of this example is
given towards the end of the paper).


Statistical modeling 


To better understand how the statistical properties, and in particular saturation, emerge
from the model, we focus on a semi-analytical growth description, in which the natural time
step t is the addition of one node to the network. The degree distribution from a network
growth algorithm is usually determined from a differential equation that describes the rate
of addition of new edges to a given node, as a function of the time s at which the node has
joined the network, i.e. dk(s,t)/dt = f(k,s,t). For our algorithm, the topological constraint
on the addition of inside edges implies that dk(s,t)/dt cannot be determined analytically from
the single node information f(k,s,t), but requires the full pairwise connection information
of the network encoded in the adjacency matrix at time t, $A_t$, i.e.

$$\frac{dk(s,t)}{dt} = f(k,s,t,A_t).$$


To work around this complication, we make the following ansatz. We suppose that the
probability of failure while trying to add an inside edge (i,j) to an already chosen node i
can be expressed by a mean-field ‘saturation’ function F(k,t) in terms of the degree k of
node i. Furthermore, suppose that the total number of edges present in the network at time
t can be approximated by K(t). F(k,t) is then defined as the average probability of a node
with degree k to be already connected to a second node j chosen with $P \propto k_j$. Thus,

$$F(k,t) = \frac{1}{|\{i : k_i(t) = k\}|} \sum_{i:\,k_i(t)=k} F_i(t), \tag{1}$$

where $F_i(t)$ is the probability that node i, with degree $k_i$, is already connected to node j.
$F_i(t)$ then has the form

$$F_i(t) = \frac{1}{2K(t)} \left( k_i(t) + \sum_{j:\,(i,j)\in E(t)} k_j(t) \right), \tag{2}$$

where $k_i(t)$ accounts for the case where node i would be chosen twice, and the second term is
the degree-weighted sum over the nodes to which node i is already connected (E(t) denotes
the network’s set of edges).
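The saturation function can be measured directly on a given network. The sketch below assumes
that the selection probability of node j is its degree divided by the total degree 2K(t), so that
the failure probability of node i is its own degree plus the degrees of its neighbours, divided
by 2K(t); the function name saturation and the adjacency format (a dict of neighbour sets) are
illustrative choices.

```python
from collections import defaultdict


def saturation(adj):
    """Return (F_i, F_k) for a simple undirected graph given as {node: set(neighbours)}.

    F_i[i]: probability that a degree-proportional draw hits node i itself or one
            of its neighbours, i.e. that an inside-edge attempt from i fails.
    F_k[k]: average of F_i over all nodes of degree k (mean-field saturation).
    """
    deg = {i: len(nb) for i, nb in adj.items()}
    total_degree = sum(deg.values())  # equals 2K, twice the number of edges
    F_i = {i: (deg[i] + sum(deg[j] for j in adj[i])) / total_degree for i in adj}
    by_degree = defaultdict(list)
    for i, f in F_i.items():
        by_degree[deg[i]].append(f)
    F_k = {k: sum(values) / len(values) for k, values in by_degree.items()}
    return F_i, F_k
```

On a star with three leaves, for instance, the hub is connected to every other node and obtains
F = 1, while each leaf obtains (1 + 3)/6 = 2/3.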

FIG. 4: a)-e) Effect of the choice of m on the network degree distribution, for different values
of p (network size t = 10^ nodes, mean of 10^ realizations). Increasing m for p << 1 increases
the influence of the first term in Eq. (3), which increases the exponent by pushing the primary
model towards the preferential attachment model. f) Real-world example: Drosophila courtship
network’s degree distribution (corresponding to the full line in Fig. 5). Degrees k < m have
small probability.

FIG. 5: Drosophila courtship language network degree distribution. a) Survival function SF(k) :=
1 - CDF(k), where CDF is the cumulative distribution function (red dots: original data). Solid
line: means, dashed lines: 0.05 quantiles, from 1000 realizations of our network growth algorithm
(N = 34, p = ^, m = 2). Inset: mapped-out Drosophila language network.

Using this approximation, we can express our algorithm by the rate of addition of new
edges to a node of degree k(s,t) as

$$\frac{dk(s,t)}{dt} = \frac{m\,k(s,t)}{2K(t)} + \frac{1-p}{p}\,\frac{k(s,t)}{K(t)}\,[1 - F(k,t)]. \tag{3}$$


In this case, the network grows out from a connected network of $N_0$ nodes, with k(s,s) = m
as the initial condition. The first term on the right-hand side of Eq. (3) describes the
increase in k due to connection to outside nodes, and the second term describes the addition
of inside edges. The whole equation has been rescaled by 1/p (canceling the p in the first
term’s numerator) such that t corresponds to the number of nodes in the network. As can
be easily seen from Eq. (3), our growth algorithm provides two well-known limiting cases.
For p = 1 we retrieve the preferential attachment growth process. For p = 0, the network
will not add nodes and must asymptotically become a clique of size $N_0$. In between, for
p << 1, the second term dominates, which renders the network more dense and produces
the large deviation from power-law structure in the distribution tail.
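The rate equation can be integrated numerically once K(t) and F(k,t) are supplied. The
following minimal sketch uses a classical fourth-order Runge-Kutta step and assumes that the
right-hand side of Eq. (3) reads m k/(2K) + ((1 - p)/p)(k/K)[1 - F(k,t)], as implied by the
rescaling by 1/p; the function name rk4_degree and the callables K and F are hypothetical and
must be provided by the user (for example from simulation data).

```python
def rk4_degree(s, t_end, p, m, K, F, dt=0.01):
    """Integrate dk/dt of Eq. (3) from t = s (with k(s, s) = m) up to t = t_end.

    K(t) -> assumed total number of edges at time t
    F(k, t) -> assumed mean-field saturation function
    """
    def rhs(k, t):
        outside = m * k / (2.0 * K(t))                       # new nodes attaching
        inside = (1.0 - p) / p * k / K(t) * (1.0 - F(k, t))  # successful inside edges
        return outside + inside

    n_steps = max(1, round((t_end - s) / dt))
    h = (t_end - s) / n_steps
    k, t = float(m), float(s)
    for _ in range(n_steps):
        k1 = rhs(k, t)
        k2 = rhs(k + 0.5 * h * k1, t + 0.5 * h)
        k3 = rhs(k + 0.5 * h * k2, t + 0.5 * h)
        k4 = rhs(k + h * k3, t + h)
        k += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return k
```

As a consistency check, for p = 1 the second term vanishes; taking K(t) = m t (each new node
contributes m edges, seed edges neglected) gives dk/dt = k/(2t), whose solution
k(s,t) = m (t/s)^(1/2) is the familiar preferential attachment result, and
rk4_degree(21, 321, 1.0, 2, lambda t: 2 * t, lambda k, t: 0.0) reproduces it numerically.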

To demonstrate the validity of our mean-field approximation, we compare the node degree
evolution obtained from a 4th-order Runge-Kutta integration of Eq. (3) using our
approximation for F(k,t) (see below) against the averaged result from 10^ realizations of
the primary model. As a result, an approximate power-law scaling clearly emerges at the early
evolution stage, and an upper bound to the envelope of node degrees emerges for the longer
evolution times t necessary to attain larger network sizes (cf. Fig. 6, where the results of the
semi-analytical description are based on exponents and prefactors from an approximation
of the results of Fig. 7a) via Eq. (3)).

FIG. 6: Comparison: Primary model / semi-analytical description. Degree evolution k(s,t) of
nodes entering the network at s = 21, 41, 81, 161, 321. Mean of 10^ primary model realizations
(dashed), compared with numerical integration of Eq. (3) (solid).

F(k,t) has a very regular behavior in both variables (k,t) (Fig. 7a) and is accompanied by a
node degree distribution P(k) as found for our primary model (Fig. 7b). Over a large range,
we can approximate F(k,t) by a power law for small k, and by a second power law at large k:

$$F(k,t) \approx \begin{cases} F_0\,k^{\beta} & \text{if } k < k_c, \\ F_0\,\dfrac{k_c^{\beta}}{k_c^{\gamma}}\,k^{\gamma} & \text{if } k > k_c, \end{cases} \tag{4}$$

where $F_0$ is a prefactor, $k_c \sim t^{\lambda}$, and the fractional term for $k > k_c$ simply
makes F(k,t) continuous at $k_c$. The exponents will vary according to the choice of algorithm
parameter p, where $0 < \lambda < 1$, i.e. $1 < k_c < t$. In accordance with Fig. 7a), the
following observations can be made: First, $\gamma < \beta$ (the exponent of the power-law fit
decreases as k crosses $k_c$). Second, F(t - 1,t) = 1, since t - 1 is the maximum possible node
degree at time t (achieved in Fig. 7a) for t = 25 only). Similarly, as p → 0, F(k,t) → 1 (the
network will tend toward a clique, where all possible connections already exist). When p = 1,
F(k,t) ceases to be relevant. Finally, for any p ∈ (0,1), as t → ∞, F(k,t) → 0, since the number
of inside edges added at each time-step approximates a constant value, so the network becomes
increasingly sparse.

We can use F(k,t) to infer the generated unnormalized degree probability distribution
N(k,t) as follows. Starting from the continuity equation, we may write

$$\frac{\partial}{\partial t} N(k,t) = -\frac{\partial}{\partial k}\left( N(k,t)\,\frac{dk}{dt} \right) + \delta_{m,k}, \tag{5}$$




where dk/dt is given by Eq. (3), and the Kronecker delta function has been included to account
for the addition of outside nodes. By differentiating Eq. (3), we notice that Eq. (5) contains
the product of k and the derivative of the saturation function F:

$$\frac{\partial}{\partial k}\frac{dk}{dt} = a_0 + a_1 - a_1 \left( k\,\frac{\partial}{\partial k} F(k,t) + F(k,t) \right), \tag{6}$$

where $a_0 := \frac{m}{2K(t)}$ and $a_1 := \frac{1-p}{p\,K(t)}$. The form of F(k,t) implies that a
sharp change should occur in the solutions of Eq. (6) around $k_c$. Indeed, a comparison between
P(k,t) and F(k,t) (Fig. 7) supports this suggestion. Thus, we hold the properties of the
saturation function F(k,t) responsible for the form of the deviation of P(k,t) from the ideal
power law.
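The differentiation step leading to Eq. (6) can be spelled out as follows. This is a sketch
under the assumption that Eq. (3) is written with the abbreviations $a_0 = m/(2K(t))$ and
$a_1 = (1-p)/(p\,K(t))$, both of which are independent of k:

```latex
\begin{align*}
\frac{dk}{dt} &= a_0\,k + a_1\,k\,[1 - F(k,t)], \\
\frac{\partial}{\partial k}\frac{dk}{dt}
  &= a_0 + a_1\,[1 - F(k,t)] - a_1\,k\,\frac{\partial F(k,t)}{\partial k} \\
  &= a_0 + a_1 - a_1\left( k\,\frac{\partial}{\partial k}F(k,t) + F(k,t) \right), \\
\frac{\partial}{\partial t}N(k,t)
  &= -\frac{dk}{dt}\,\frac{\partial N(k,t)}{\partial k}
     - N(k,t)\,\frac{\partial}{\partial k}\frac{dk}{dt} + \delta_{m,k}.
\end{align*}
```

The last line is the product rule applied to the continuity equation; it shows where the term
$k\,\partial F/\partial k$ enters the evolution of N(k,t), and hence why a kink of F(k,t) at
$k_c$ is expected to leave a mark on the degree distribution.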




FIG. 7: Relation between power-law deviation hump and saturation function: a) Mean-field
saturation F(k,t), b) mean of the degree distribution. Data set: 10^ network realizations for
given time t using p = ^. Vertical grey lines are visual aids. The figure indicates the
disappearance of the hump structure in the thermodynamic limit.
Discussion


Examples of edge saturation network growth emerge from the fundamental situation
where the state of a physical system is described by a symbol, and where time acting on
the states leads to a description in terms of a language (symbolic dynamics and formal
languages, natural languages). Starting with a finite number $N_0$ of states, observations
of the system in time yield sequences of states that define links on a graph between nodes
(states), which implies that more important or more versatile nodes will have more links. As
such a network evolves towards a finer description, two processes may occur: 1) adjacencies are
established between previously unconnected nodes (preferentially between more versatile
ones); 2) a new node is added and connected preferentially to already highly connected
nodes. Evidently, in many networks there will, however, be a limitation on the number of
edges that can be hosted by a given node.

The Drosophila courtship body language of 37 fundamental behavioral states and
its network is an example of such a process. The states are fundamental in the sense that
each act could, from the point of view of the physics of body motion, be followed by any other
act. Some transitions, however, are generally not taken, leading to missing edges. Well-defined
connected sub-networks characterize a chosen courtship partner’s class, according to which
protagonists can be distinguished (male, female (virgin, mature, mated), fruitless). Within
these bounds, courtship exploits the available expression space, corroborating the view that
it might advertise individual properties of the sender in the eyes of a courtship partner.
To compare our network growth algorithm with the data from male-female interaction, we
grow the network until the number of nodes (symbols) is depleted, with p chosen so that on
average the number of edges matches that of the courtship network. A comparison, without
further fitting, shows that the two degree distributions match extremely well and that the
proposed generating algorithm is very specific (Fig. 5).
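For small networks such as this one, the comparison is made in terms of the survival function
SF(k) := 1 - CDF(k) rather than the raw degree histogram. A minimal sketch of its empirical
evaluation from a list of node degrees is given below; the function name survival_function and
the convention CDF(k) = P(degree <= k) are assumptions.

```python
from collections import Counter


def survival_function(degrees):
    """Return {k: SF(k)} with SF(k) = 1 - CDF(k), the fraction of nodes with degree > k.

    Only degrees that actually occur are listed; CDF(k) = P(degree <= k) is assumed.
    """
    n = len(degrees)
    counts = Counter(degrees)
    sf = {}
    remaining = n  # number of nodes with degree strictly larger than the current k
    for k in sorted(counts):
        remaining -= counts[k]
        sf[k] = remaining / n
    return sf
```

Applied to the degree sequences of many simulated realizations, the same function yields the
pointwise means and quantiles against which the measured curve can be compared.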

Our paradigm may also appear in the guise of an equilibrium condition in the following
sense. Complex networks in physics or in biology are often constrained to maintain some
‘average’ conditions. As soon as (possibly self-enhancing) node interaction sets in, this
needs to be balanced by homeostasis, i.e. a competitive, counter-balancing mechanism
that weakens other connections of the same node to the network. In the neural networks
domain, a closely related principle is known as ‘Hebbian learning’. Self-organized Hebbian
learning in the super-paramagnetic phase of ensembles has been proven a reliable and
efficient way of clustering that does away with convexity requirements of cluster borders. A
very similar approach has also been used as a synchronization model for coupled oscillators,
where the oscillators’ struggle to synchronize is expressed by competing connection strengths
$w_{ij}$ that evolve according to a dynamical update rule involving $s_{ij}$ and the edges
$(i,k) \in E$, where $s_{ij}$ measures the pairwise oscillator synchrony. The resulting
distribution of $w_{ij}$ has been shown to tend, for intermediate coupling strengths, towards a
hump-terminated power law (cf. Fig. 2a)). This dynamical law expresses the limited resources
available for the local wiring around each node, which in our model is encoded in the
probability p ruling the edge saturation. We envisage that avalanche distributions of the
typical form of Fig. 2a) could also be understood similarly.

Many interesting real-world phenomena dwell on the mesoscale. In social networks, the
largest scale is relevant, e.g., for the study of disease and rumor spreading, but more subtle
social dynamics happen within the community structures. Our results suggest that a
large class of systems can be formulated as growing along simple principles, similar and
in addition to preferential attachment. The sets of m, p parameters needed to recover an
experimental distribution, i.e. the violation of the ideal power law on the macroscopic scale,
provide us with an insight into the local mesoscale structures present in the network.
In this way, starting from non-ideal power-law distributions of complex networks, an avenue
opens towards the identification and understanding of interesting mesoscale real-world
phenomena in physics.